MetaViewer: Towards A Unified Multi-View Representation
Existing multi-view representation learning methods typically follow a
specific-to-uniform pipeline, extracting latent features from each view and
then fusing or aligning them to obtain the unified object representation.
However, the manually pre-specified fusion functions and the view-private redundant
information mixed into the features can degrade the quality of the derived
representation. To overcome these issues, we propose a novel
bi-level-optimization-based multi-view learning framework in which the
representation is learned in a uniform-to-specific manner. Specifically, we
train a meta-learner, namely MetaViewer, to learn the fusion and model the
view-shared meta representation in the outer-level optimization. Starting from
this meta representation, view-specific base-learners are then required to
rapidly reconstruct the corresponding view in the inner-level optimization.
MetaViewer is eventually updated by observing these reconstruction processes,
from uniform to specific, over all views, and learns an optimal fusion scheme
that separates and filters out view-private information. Extensive experimental
results on downstream tasks such as classification and clustering demonstrate
the effectiveness of our method.
Comment: 8 pages, 5 figures, conference
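The bi-level, uniform-to-specific idea in this abstract can be illustrated with a toy sketch (not the paper's actual model): an outer loop updates a shared scalar meta representation, while an inner loop rapidly fits a view-specific parameter that reconstructs each view from it. The scalar views, the finite-difference outer update, and all names here are illustrative assumptions.

```python
# Toy bi-level optimization in the spirit of a uniform-to-specific pipeline.
# Outer level: a scalar meta representation m shared by all views.
# Inner level: a per-view scalar a rapidly adapts to reconstruct each view.

views = [2.0, 3.0]  # two toy "views" of the same object

def inner_adapt(m, v, steps=5, lr=0.1):
    """Inner loop: fit a so that a * m reconstructs view v."""
    a = 0.0
    for _ in range(steps):
        a -= lr * 2 * m * (a * m - v)  # gradient of (a*m - v)^2 w.r.t. a
    return a

def outer_loss(m):
    """Sum of post-adaptation reconstruction losses over all views."""
    total = 0.0
    for v in views:
        a = inner_adapt(m, v)
        total += (a * m - v) ** 2
    return total

# Outer loop: update m by finite-difference gradient descent, i.e. the
# meta-learner improves by observing the inner reconstruction processes.
m, eps, lr = 0.5, 1e-4, 0.05
before = outer_loss(m)
for _ in range(50):
    grad = (outer_loss(m + eps) - outer_loss(m - eps)) / (2 * eps)
    m -= lr * grad
after = outer_loss(m)
```

The outer update only sees the views through the inner adaptation, which is the structural point of the bi-level scheme: a meta representation is judged by how quickly view-specific learners can reconstruct each view from it.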
Applications of Federated Learning in Smart Cities: Recent Advances, Taxonomy, and Open Challenges
Federated learning plays an important role in the development of smart cities.
With the growth of big data and artificial intelligence, data privacy
protection has become a pressing problem in this process, and federated
learning is capable of addressing it. This paper starts with the current
developments of federated learning and its applications in various fields, and
conducts a comprehensive investigation. We summarize the latest research on the
application of federated learning across the fields of smart cities, providing
an in-depth view of its current development in the Internet of Things,
transportation, communications, finance, medicine, and other areas. Before
that, we introduce the background, definition, and key technologies of
federated learning. Furthermore, we review the key technologies and the latest
results. Finally, we discuss the future applications and research directions of
federated learning in smart cities.
Learning to Learn Kernels with Variational Random Features
In this work, we introduce kernels with random Fourier features in the
meta-learning framework to leverage their strong few-shot learning ability. We
propose meta variational random features (MetaVRF) to learn adaptive kernels
for the base-learner, which is developed in a latent variable model by treating
the random feature basis as the latent variable. We formulate the optimization
of MetaVRF as a variational inference problem by deriving an evidence lower
bound under the meta-learning framework. To incorporate shared knowledge from
related tasks, we propose a context inference of the posterior, which is
established by an LSTM architecture. The LSTM-based inference network can
effectively integrate the context information of previous tasks with
task-specific information, generating informative and adaptive features. The
learned MetaVRF can produce kernels of high representational power with a
relatively low spectral sampling rate and also enables fast adaptation to new
tasks. Experimental results on a variety of few-shot regression and
classification tasks demonstrate that MetaVRF delivers much better, or at least
competitive, performance compared to existing meta-learning alternatives.
Comment: ICML'2020; code is available at https://github.com/Yingjun-Du/MetaVR
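The random Fourier feature construction that MetaVRF builds on can be sketched in a few lines. This is the generic RFF approximation of a Gaussian RBF kernel, not the adaptive, LSTM-inferred version the abstract describes, and all names are illustrative:

```python
import math
import random

random.seed(0)

def rff_features(x, ws, bs):
    """Map scalar x to random Fourier features z(x) = sqrt(2/D) * cos(w*x + b)."""
    d = len(ws)
    return [math.sqrt(2.0 / d) * math.cos(w * x + b) for w, b in zip(ws, bs)]

D = 2000
ws = [random.gauss(0.0, 1.0) for _ in range(D)]       # spectral samples of the RBF kernel
bs = [random.uniform(0.0, 2 * math.pi) for _ in range(D)]

def approx_kernel(x, y):
    """Inner product of the feature maps approximates the kernel value."""
    zx, zy = rff_features(x, ws, bs), rff_features(y, ws, bs)
    return sum(a * b for a, b in zip(zx, zy))

def true_kernel(x, y):
    """Gaussian RBF kernel with unit bandwidth."""
    return math.exp(-0.5 * (x - y) ** 2)

err = abs(approx_kernel(0.3, 1.1) - true_kernel(0.3, 1.1))
```

Sampling the frequencies `ws` from the kernel's spectral density is what makes the inner product an unbiased kernel estimate; MetaVRF's contribution is to infer that spectral distribution per task instead of fixing it, which is how it achieves good kernels at a low spectral sampling rate.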
NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning
Panoptic Narrative Detection (PND) and Segmentation (PNS) are two challenging
tasks that involve identifying and locating multiple targets in an image
according to a long narrative description. In this paper, we propose a unified
and effective framework called NICE that can jointly learn these two panoptic
narrative recognition tasks. Existing visual grounding tasks use a two-branch
paradigm, but applying this directly to PND and PNS can result in prediction
conflict due to their intrinsic many-to-many alignment property. To address
this, we introduce two cascading modules based on the barycenter of the mask,
which are Coordinate Guided Aggregation (CGA) and Barycenter Driven
Localization (BDL), responsible for segmentation and detection, respectively.
By linking PNS and PND in series with the barycenter of segmentation as the
anchor, our approach naturally aligns the two tasks and allows them to
complement each other for improved performance. Specifically, CGA provides the
barycenter as a reference for detection, reducing BDL's reliance on a large
number of candidate boxes. BDL leverages its excellent properties to
distinguish different instances, which improves the performance of CGA for
segmentation. Extensive experiments demonstrate that NICE surpasses all
existing methods by a large margin, achieving gains of 4.1% on PND and 2.9% on
PNS over the state of the art. These results validate the effectiveness of our
proposed collaborative learning strategy. The project of this work is made
publicly available at https://github.com/Mr-Neko/NICE.
Comment: 18 pages, 9 figures, 9 tables
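NICE anchors both tasks on the barycenter of the segmentation mask; for a binary mask this is simply the centroid of the foreground pixels. A minimal sketch, where the function name and the list-of-lists mask format are illustrative assumptions:

```python
def mask_barycenter(mask):
    """Return the (row, col) barycenter of a binary mask given as a 2D list of 0/1."""
    ys = xs = n = 0
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                ys += y
                xs += x
                n += 1
    if n == 0:
        raise ValueError("empty mask")
    return ys / n, xs / n

# A 2x3 mask whose foreground occupies the right 2x2 block:
center = mask_barycenter([[0, 1, 1],
                          [0, 1, 1]])  # -> (0.5, 1.5)
```

Because the barycenter is a single well-defined point per instance, it can serve as the shared anchor that links a segmentation branch (which produces it) to a detection branch (which uses it as a localization reference), as the abstract describes for CGA and BDL.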
Attentional Prototype Inference for Few-Shot Segmentation
This paper aims to address few-shot segmentation. While existing
prototype-based methods have achieved considerable success, they suffer from
uncertainty and ambiguity caused by limited labeled examples. In this work, we
propose attentional prototype inference (API), a probabilistic latent variable
framework for few-shot segmentation. We define a global latent variable to
represent the prototype of each object category, which we model as a
probabilistic distribution. The probabilistic modeling of the prototype
enhances the model's generalization ability by handling the inherent
uncertainty caused by limited data and intra-class variations of objects. To
further enhance the model, we introduce a local latent variable to represent
the attention map of each query image, which enables the model to attend to
foreground objects while suppressing the background. The optimization of the
proposed model is formulated as a variational Bayesian inference problem, which
we solve with amortized inference networks. We conduct extensive experiments on
four benchmarks, where our proposal obtains at least competitive and often
better performance than state-of-the-art prototype-based methods. We also
provide comprehensive analyses and ablation studies to gain insight into the
effectiveness of our method for few-shot segmentation.
Comment: Pattern Recognition Journal
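The probabilistic-prototype idea can be illustrated with a toy sketch: fit a diagonal Gaussian over the support features of a category and score a query feature by its log-likelihood under that distribution. This is a deliberate simplification of the amortized variational inference the abstract describes, and all names are illustrative:

```python
import math

def gaussian_prototype(support_feats):
    """Fit a diagonal Gaussian (per-dimension mean and variance) to support features."""
    n = len(support_feats)
    d = len(support_feats[0])
    mean = [sum(f[i] for f in support_feats) / n for i in range(d)]
    var = [sum((f[i] - mean[i]) ** 2 for f in support_feats) / n + 1e-6  # floor for stability
           for i in range(d)]
    return mean, var

def log_likelihood(x, mean, var):
    """Log-density of feature x under the diagonal Gaussian prototype."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

# Three support features of one category, then two candidate query features:
support = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
mean, var = gaussian_prototype(support)
near = log_likelihood([1.0, 1.0], mean, var)   # close to the prototype
far = log_likelihood([5.0, 5.0], mean, var)    # far from the prototype
```

Treating the prototype as a distribution rather than a point vector is what lets the model express the uncertainty that comes from having only a few labeled examples: a wide variance signals an unreliable prototype, a narrow one a confident match.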